Classifying Papers from Different Computer Science Conferences

Authors

  • Yaakov HaCohen-Kerner
  • Avi Rosenfeld
  • Maor Tzidkani
  • Daniel Nisim Cohen
Abstract

This paper analyzes which stylistic characteristics distinguish different styles of writing, specifically the writing in different types of A-level computer science articles. To do so, we compared full papers using stylistic feature sets and a supervised machine learning method. We report on the success of this approach in identifying papers from the last 6 years of three conferences: SIGIR, ACL, and AAMAS. The approach achieves accuracies of 95.86%, 97.04%, 93.22%, and 92.14% for four classification experiments: (1) SIGIR / ACL, (2) SIGIR / AAMAS, (3) ACL / AAMAS, and (4) SIGIR / ACL / AAMAS, respectively. The Part of Speech (PoS) and Orthographic feature sets were superior to all others and were found to be key components in distinguishing types of writing.
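As a rough illustration of classifying documents by style rather than by topic, the sketch below maps each text to a few orthographic ratios and assigns it to the nearest class centroid. Everything in it is an assumption made for illustration: the abstract does not list the individual features or name the learning algorithm, so the four features, the nearest-centroid rule, and the function names (orthographic_features, train_centroids, predict) are hypothetical stand-ins, not the authors' method.

```python
import math
import string
from collections import defaultdict


def orthographic_features(text):
    """Map a document to four orthographic values (an illustrative choice)."""
    n_chars = max(len(text), 1)   # avoid division by zero on empty input
    words = text.split()
    n_words = max(len(words), 1)
    return [
        sum(c.isupper() for c in text) / n_chars,              # uppercase ratio
        sum(c.isdigit() for c in text) / n_chars,              # digit ratio
        sum(c in string.punctuation for c in text) / n_chars,  # punctuation ratio
        sum(len(w) for w in words) / n_words,                  # mean token length
    ]


def train_centroids(docs, labels):
    """Average the feature vectors of each class (nearest-centroid training)."""
    by_label = defaultdict(list)
    for doc, label in zip(docs, labels):
        by_label[label].append(orthographic_features(doc))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict(centroids, doc):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = orthographic_features(doc)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy usage with made-up strings standing in for full papers.
centroids = train_centroids(
    ["a bc de f gh", "ab c d ef g",
     "information retrieval evaluation", "computational linguistics parsing"],
    ["short", "short", "long", "long"],
)
print(predict(centroids, "probabilistic classification"))
```

The features are left unscaled here, so mean token length dominates the distance. A real experiment would normalize the features, add further feature sets such as PoS tag frequencies, and estimate accuracy on held-out papers.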

Similar resources

Peer-Selected “Best Papers”—Are They Really That “Good”?

BACKGROUND Peer evaluation is the cornerstone of science evaluation. In this paper, we analyze whether or not a form of peer evaluation, the pre-publication selection of the best papers in Computer Science (CS) conferences, is better than random, when considering future citations received by the papers. METHODS Considering 12 conferences (for several years), we collected the citation counts f...

Four Decades of Iran's Scientific Activity from the Perspective of Conference Papers, Highly Cited and Hot Papers, and Open Access Papers, with a Look at the Law of the Country's Economic, Social, and Cultural Development Plan

This study investigates Iran's scientific production from the pre-revolutionary period to 2016, with emphasis on conference proceedings, highly cited and hot papers, and open access papers, in light of the Law of the Economic, Social, and Cultural Development Plan of Iran. A descriptive-analytical method was used. To achieve the research objectives, data were extracted from Clarivate Analytics (Thomson Reuters)...

Algorithmic Detection of Computer Generated Text

Computer generated academic papers have been used to expose a lack of thorough human review at several computer science conferences. We assess the problem of classifying such documents. After identifying and evaluating several quantifiable features of academic papers, we apply methods from machine learning to build a binary classifier. In tests with two hundred papers, the resulting classifier ...

The skewness of computer science

Computer science is a relatively young discipline combining science, engineering, and mathematics. The main flavors of computer science research involve the theoretical development of conceptual models for the different aspects of computing and the more applicative building of software artifacts and assessment of their properties. In the computer science publication culture, conferences are an ...

Markov Topic Models

We develop Markov topic models (MTMs), a novel family of generative probabilistic models that can learn topics simultaneously from multiple corpora, such as papers from different conferences. We apply Gaussian (Markov) random fields to model the correlations of different corpora. MTMs capture both the internal topic structure within each corpus and the relationships between topics across the co...

Publication date: 2013